Evaluation of Statistical Estimation Methods for Lognormally Distributed Variables
Authors
Abstract
Distributions of many chemical, physical, and microbiological properties of soils appear to be lognormal. Several conflicting recommendations exist in the soil science and statistical literature on how best to estimate the population mean, variance, and coefficient of variation of lognormally distributed data. We chose to determine with statistical certainty which of the following three methods is best: (i) the method of moments (method 1); (ii) maximum likelihood (method 2); and (iii) Finney's method (method 3). We assessed the efficacy of these three methods for estimating the mean, variance, and coefficient of variation of lognormal data over sample sizes from n = 4 to 100. Three test lognormal populations were used in our evaluation, with coefficients of variation that span the range seen for many soil variables (CVs of 50%, 100%, and 200%). We found that Finney's method was best for estimating the mean and variance of lognormal data when the coefficient of variation of the underlying lognormal frequency distribution exceeds 100%; below this value, the extra computational effort required to implement Finney's technique buys little relative to the method of moments. Finney's method has not been previously applied by soil scientists, but its superiority over maximum likelihood suggests that the latter should not be generally recommended for estimating the mean, variance, and coefficient of variation of lognormal data.
Additional Index Words: Lognormal, Mean square error, Bias, Efficiency, Soil variables, Monte Carlo simulation.
T.B. Parkin, J.J. Meisinger, and J.L. Starr, USDA-ARS, BARC, Soil Nitrogen and Environmental Chemistry Lab., Beltsville, MD 20705; S.T. Chester and J.A. Robinson, The Upjohn Company, Kalamazoo, MI 49001. Contribution of the USDA-ARS and The Upjohn Company. Received 29 May 1987. *Corresponding author. Published in Soil Sci. Soc. Am. J. 52:323-329 (1988).
The variability of soil properties has received increased interest (Nielsen & Bouma, 1985). The combination of low-cost computers and automated analysis systems has enabled scientists to generate large databases for particular soil variables. These large databases have, in turn, allowed the characterization of the variability and frequency distributions of soil variables. Such analyses indicate that the frequency distributions of many physical, chemical, and microbiological soil properties are skewed to the right and are better approximated by the lognormal frequency distribution than by the normal (Gaussian) probability density function (Table 1).
Confusion exists on how best to estimate the mean, variance, and coefficient of variation of lognormally distributed data. This stems from the fact that several statistical procedures for estimating these population parameters have appeared in the soil science and statistical literature (Warrick & Nielsen, 1980; Koch & Link, 1970). A statistically complete evaluation of the most commonly applied methods has not been published in a source accessible to the majority of soil scientists. We undertook the present study to determine the statistical efficacy of three methods (method of moments, maximum likelihood, and Finney's approximation) for estimating the population mean, variance, and coefficient of variation of lognormal data. Of these three, method 2 (maximum likelihood) has been recommended for use by some soil scientists (Warrick & Nielsen, 1980; Folorunso & Rolston, 1984; Parkin et al., 1985), although, as we will show, it is generally inferior to Finney's method (method 3) as well as the more commonly applied method of moments (method 1). Finney's method for estimating the mean, variance, and coefficient of variation of lognormal data has only rarely been applied to soils data (White et al., 1987; Parkin, 1987; Parkin et al., 1987).
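To make the three methods concrete, the following is a minimal Python sketch of how they are commonly written in the statistical literature. The formulas are not given in the text above, so everything here is an assumption: the function names are hypothetical, the maximum likelihood variant is taken to use the sample variance of the logs with an n - 1 divisor, Finney's series is the usual textbook form psi_n(t) = 1 + (n-1)t/n + (n-1)^3 t^2 / (n^2 (n+1) 2!) + ..., and the Finney coefficient of variation is simply formed as the ratio of the square root of the variance estimate to the mean estimate.

```python
import math
from statistics import fmean, variance


def finney_psi(t, n, tol=1e-12, max_terms=200):
    """Assumed textbook form of Finney's series psi_n(t), summed until terms are negligible."""
    total = 1.0
    term = (n - 1) * t / n  # first-order term of the series
    for j in range(1, max_terms):
        total += term
        if abs(term) < tol * abs(total):
            break
        # ratio of term j+1 to term j in the series
        term *= (n - 1) ** 2 * t / (n * (n + 2 * j - 1) * (j + 1))
    return total


def method_of_moments(x):
    """Method 1: ordinary sample mean, variance, and CV of the untransformed data."""
    m, v = fmean(x), variance(x)
    return m, v, math.sqrt(v) / m


def maximum_likelihood(x):
    """Method 2 (assumed variant): plug the log-scale mean and variance into lognormal formulas."""
    y = [math.log(xi) for xi in x]
    ybar, s2 = fmean(y), variance(y)  # variance() uses the n - 1 divisor
    m = math.exp(ybar + s2 / 2)
    return m, m * m * (math.exp(s2) - 1), math.sqrt(math.exp(s2) - 1)


def finney(x):
    """Method 3 (assumed textbook form): log-scale estimates corrected with psi_n."""
    y = [math.log(xi) for xi in x]
    n = len(y)
    ybar, s2 = fmean(y), variance(y)
    m = math.exp(ybar) * finney_psi(s2 / 2, n)
    v = math.exp(2 * ybar) * (
        finney_psi(2 * s2, n) - finney_psi((n - 2) * s2 / (n - 1), n)
    )
    return m, v, math.sqrt(v) / m  # CV as a simple ratio: an illustrative choice
```

Each function returns a (mean, variance, CV) tuple for a list of positive observations, so the three can be called on the same sample and compared side by side. Because every coefficient in the series is smaller than the matching coefficient of the exponential series, psi_n(t) is below exp(t) for t > 0, which is why the Finney mean in this sketch falls below the maximum likelihood mean for the same data.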
The present study is a theoretical one in the sense that we investigated the properties of the three estimation methods in a "world" where the answers were known. Description of any new statistical estimation method often incorporates an evaluation of the method for a family of known probability density functions that cover the range of distributions seen or expected...
Table 1. Abbreviated survey of soil variables which have been reported to be approximately lognormally distributed.
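The idea of a "world" where the answers are known can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design, which the text above does not describe. It assumes a population mean of 10, a fixed random seed, and a modest number of replicates, all chosen for illustration, and it compares only two mean estimators: the sample mean and a log-scale plug-in estimator exp(ybar + s^2/2). The log-scale parameters follow from the standard lognormal identities sigma^2 = ln(1 + CV^2) and mu = ln(mean) - sigma^2/2. The function names are hypothetical.

```python
import math
import random
from statistics import fmean, variance


def lognormal_params(mean, cv):
    """Log-scale (mu, sigma) of a lognormal population with the given mean and CV."""
    sigma2 = math.log(1.0 + cv ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def simulate(cv, n, reps=2000, true_mean=10.0, seed=0):
    """Relative bias and mean square error of two mean estimators at sample size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    mu, sigma = lognormal_params(true_mean, cv)
    estimates = {"moments": [], "log_plug_in": []}
    for _ in range(reps):
        x = [rng.lognormvariate(mu, sigma) for _ in range(n)]
        y = [math.log(v) for v in x]
        estimates["moments"].append(fmean(x))
        estimates["log_plug_in"].append(math.exp(fmean(y) + variance(y) / 2.0))
    summary = {}
    for name, est in estimates.items():
        bias = fmean(est) - true_mean
        mse = fmean([(e - true_mean) ** 2 for e in est])
        summary[name] = (bias / true_mean, mse)
    return summary
```

Calling simulate(cv, n) for cv in (0.5, 1.0, 2.0) and a few sample sizes between 4 and 100 gives a rough picture of how bias and mean square error change with skewness and sample size. Because the true mean is fixed in advance, both quantities can be computed directly rather than approximated.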
Similar resources
Calculating Confidence Intervals for the Mean of a Lognormally Distributed Variable
Many physical, chemical, and biological properties of soils exhibit skewed distributions that can be approximated by the two-parameter lognormal distribution. Recent attention in the soils literature has focused on the best method of estimating the mean for lognormally distributed variables; however, little attention has been given to methods for constructing confidence intervals about the mean...
Asymptotic Behavior of Tail Density for Sum of Correlated Lognormal Variables
We consider the asymptotic behavior of a probability density function for the sum of any two lognormally distributed random variables that are nontrivially correlated. We show that both the left and right tails can be approximated by some simple functions. Furthermore, the same techniques are applied to determine the tail probability density function for a ratio statistic, and for a sum with mo...
Statistical Evaluation of Median Estimators for Lognormally Distributed Variables
The increased interest in the variability of soil properties is responsible for recent observations that soil variables are not normally distributed but are more closely approximated by the two-parameter lognormal frequency distribution. Statistical methods commonly applied in the estimation of the median of lognormally distributed data, however, are biased or inefficient. The purpose of this s...
Statistical analysis of co-channel interference in wireless communications systems
Co-channel interference is recognized as one of the major factors that limits the capacity and link quality of a wireless communications system. An appropriate understanding of the statistical behavior of the co-channel interference is therefore required when analyzing and designing techniques that mitigate its undesired effects. The total co-channel interference in a wireless communications sy...
An Evaluation of Normal Versus Lognormal Distribution in Data Description and Empirical Analysis. Practical Assessment, Research & Evaluation
Many existing methods of statistical inference and analysis rely heavily on the assumption that the data are normally distributed. However, the normality assumption is not fulfilled when dealing with data which does not contain negative values or are otherwise skewed – a common occurrence in diverse disciplines such as finance, economics, political science, sociology, philology, biology and phy...